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Abstract — We propose a general scheme to cre- 
ate time sequences that fulfill given constraints but 
are random otherwise. Significance levels for nonlin- 
earity tests are as usually obtained by Monte Carlo 
resampling. In a new scheme, constraints including 
multivariate, nonlinear, and nonstationary properties 
are implemented in the form of a cost function. 

I. INTRODUCTION 

There are two distinct motivations to use a nonlinear 
approach when analysing time scries data. It might be 
that the arsenal of linear methods has been exploited 
thoroughly but all the efforts left certain structures in 
the time series unaccounted for. It is also common 
that a system is known to include nonlinear compo- 
nents and therefore a linear description seems unsatis- 
factory in the first place. The latter reasoning is rather 
dangerous since the nonlinearity may not be reflected 
in a specific signal. In particular, we don't know if it 
is of any practical use to go beyond the linear approx- 
imation. Consequently, the application of nonlinear 
time series methods has to be justified by establishing 
nonlinearity in the time series data. 

This paper will discuss formal statistical tests for 
nonlinearity. First, a suitable null hypothesis for 
the underlying process will be formulated covering all 
Gaussian linear processes and, in fact, a class that is 
somewhat wider. We will then test this null hypothe- 
sis by comparing a nonlinear parameter with its prob- 
ability distribution for the null hypothesis which has 
to be estimated by a Monte Carlo resampling tech- 
nique. This procedure is known in the nonlinear time 
series literature as the method of surrogate data, see 
Ref. 0] . Thus we have to face a two-fold task. We have 
to find a nonlinear parameter that is able to actually 
detect an existing deviation of the data from a given 
null hypothesis and we have to provide an ensemble of 
randomised time series that accurately represents the 
null hypothesis. 

II. DETECTING NONLINEARITY 

Several quantities have been discussed that can be 
used to characterise nonlinear time series 101. For the 



purpose of nonlinearity testing we need such quan- 
tities that are particularly powerful in discriminat- 
ing linear dynamics and weakly nonlinear signatures. 
Traditional measures of nonlinearity are derived from 
generalisations of the two-point autocovariance func- 
tion or the power spectrum. One particularly useful 
third order quantity is X]!^=r-i-i('*n ~ Sn-r)'^ since it 
measures the asymmetry of a series under time re- 
versal. When a nonlinearity test is performed with 
the question in mind if nonlinear deterministic mod- 
eling of the signal may be useful, it seems most ap- 
propriate to use a test statistic that is related to a 
nonlinear deterministic approach . Widely used are 
test statistics which in some way or the other quan- 
tify the nonlinear predictability of the signal. Let 
Xn = (s,i-(m-i)r5 • ■ ■ , Sn) bc the scqucnce of time de- 
lay embedding vectors obtained from the scalar time 
series s„. The nonlinear prediction error can then 
be defined by \/Y^[xn+i — F{xn)Y- The prediction 
F over one time step is performed by averaging over 
the future values of all neighboring delay vectors Xn' 
closer to a?„ than e in m embedding dimensions. 

III. SURROGATE DATA 

Almost all measures of nonlinearity have in common 
that their probability distribution on finite data sets 
is not known analytically. It is therefore necessary 
to use a Monte Carlo resampling technique. Tradi- 
tional bootstrap methods use explicit model equa- 
tions that have to be extracted from the data. This 
typical realizations approach can be very powerful for 
the computation of confidence intervals, provided the 
model equations can be extracted successfully. As dis- 
cussed by Theiler and Prichard Q , the alternative ap- 
proach of constrained realizations is more suitable for 
the purpose of hypothesis testing we are interested in 
here. It avoids the fitting of model equations by di- 
rectly imposing the desired structures onto the ran- 
domised time series. However, the choice of possible 
null hypothesis is limited by the difficulty to impose 
arbitrary structures on otherwise random sequences. 
A general method has been recently proposed by one 
of the authors H. 



It is essential for the validity of the statistical test 
that the surrogate series are created properly. If they 
contain spurious differences to the measured data, 
these may be detected by the test and interpreted as 
signatures of nonlinearity. A simple case is the null hy- 
pothesis that the data consists of independent draws 
from a fixed probability distribution. Surrogate time 
series can be simply obtained by randomly shuffling 
the measured data. If we find significantly different 
serial correlations in the data and the shuffles, we can 
reject the hypothesis of independence. 

A. Fourier based methods 

A step towards more interesting null hypotheses is 
to incorporate the structures reflected by linear two- 
point autocorrelations. A corresponding null hypothe- 
sis is that the data have been generated by some linear 
stochastic process with Gaussian increments. The sta- 
tistical test is complicated by the fact that we don't 
want to test against one particular linear process only 
(one specific choice of ARM A coefficients), but against 
a whole class of processes. This is called a composite 
null hypothesis. The unknown coefficients are some- 
times referred to as nuissance parameters. There are 
basically three directions we can take in this situation. 
First, we could try to make the discriminating statis- 
tic independent of the nuissance parameters. This ap- 
proach has not been demonstrated to be viable for any 
but some very simple statistics. Second, we could de- 
termine which linear model is most likely realised in 
the data by a fit for the coefficients, and then test 
against the hypothesis that the data has been gener- 
ated by this particular model. Surrogates are simply 
created by running the fitted model. The main draw- 
back is that we cannot recover the true underlying pro- 
cess by any fit procedure. 

The null hypothesis of an underlying Gaussian lin- 
ear stochastic process can also be formulated by stat- 
ing that all structure to be found in a time series is 
exhausted by computing first and second order quan- 
tities, the mean, the variance and the autocovariance 
function. This means that a randomised sample can 
be obtained by creating sequences with the same sec- 
ond order properties as the measured data, but which 
are otherwise random. When the linear properties 
are specified by the squared amplitudes of the Fourier 
transform (that is, the periodogram estimator of the 
power spectrum), surrogate time series are readily cre- 
ated by multiplying the Fourier transform of the data 
by random phases and then transforming back to the 
time domain. 

The most obvious deviation from the Gaussian lin- 
ear process is usually that the data don't follow a 
Gaussian distribution. There is a simple generalisa- 
tion of the null hypothesis that explains deviations 
from the normal distribution by the action of a mono- 



tone, static measurement function: s„ = s(xn) where 
{xn} is a realisation of an ARMA process. We want to 
regard a time series from such a process as essentially 
linear since the only nonlinearity is contained in the — 
in principle invertible — measurement function s(-). 

The most commonly used method to create surro- 
gate data sets for this null hypothesis essentially at- 
tempts to invert s(-) by rescaling the time series {s„} 
to conform with a Gaussian distribution. The rescaled 
version is then phase randomised (conserving Gaus- 
sianity on average) and the result is rescaled to the em- 
pirical distribution of {sn}. Schreiber and Schmitz 
argue that for short and strongly correlated sequences, 
this algorithm can yield an incorrect test due to a 
bias towards a flat spectrum. They propose a method 
which iteratively corrects deviations in spectrum and 
distribution. Alternatingly, the surrogate is filtered to- 
wards the correct Fourier amplitudes and rank-ordered 
to the correct distribution. The accuracy that can be 
reached depends on the size and structure of the data 
and is generally sufficient for hypothesis testing. 

B. General scheme 

The above schemes are based on the Fourier ampli- 
tudes of the data which is appropriate in many cases. 
However, there remain some fiaw, the strongest be- 
ing the severely restricted class of testable null hy- 
potheses. In the general approach of Ref. [H, con- 
straints (e.g. autocorrelations) on the surrogate data 
are implemented by a cost function i?({s„}) which 
has a global minimum when the constraints are ful- 
filled. This cost function will be minimised by sim- 
ulated annealing |8|. Starting with a random per- 
mutation of the original time series, the surrogate is 
modified by exchanging two points chosen at random. 
The modification will be accepted if it yields a lower 
value for the cost function or else with a probability 
p = exp(— Ai?/r). The "system temperature" T will 
be lowered slowly to let the system settle down to a 
minimum. 

The constraint that the autocovariances of the sur- 
rogate C"(t) should be the same as those of the data 
C(t) can be realised by specifying the discrepancy as 
a cost function, for example 

E=Y,\C'{r)-C{T)\. (1) 

T = Q 

Now E{{sn}) is minimised among all permutations 
{s„} of the original time series {s„}. With an ap- 
propriate cooling scheme, the annealing procedure can 
reach any desired accuracy. 

Gonstrained randomisation using combinatorial 
minimisation is a very flexible method since in prin- 
ciple arbitrary constraints can be realised. It can be 
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Figure 1: Simultaneous measurements of breath and 
heart rates, upper and middle traces. Lower trace: a 
surrogate heart rate series preserving the autocorre- 
lation structure and the cross-correlation to the fixed 
breath rate series, as well as a spurious gap in the data. 
Auto- and cross-correlation together seems to explain 
some, but not all of the structure present in the heart 
rate series. 



quite useful to be able to incorporate into the surro- 
gates any feature of the data that is understood al- 
ready or that is considered to be uninteresting. The 
price for the accuracy and generality of the method is 
its high computational cost. 

IV. APPLICATIONS 

Heart rate — Let us give an example for the flex- 
ibility of the approach, a simultaneous recording of 
the breath rate and the instantaneous heart rate of a 
human subject during sleep see Fig. ^ An inter- 
esting question is, how much of the structure in the 
heart rate (middle) can be explained by linear depen- 
dence on the breath rate (upper). In order to answer 
this question, we need to make surrogates that have 
the same autocorrelation structure but also the same 
cross-correlation with respect to the fixed input signal, 
the breath rate. Accordingly, a constraint is used to 
fix the auto-covariance and the cross-covariance with 
the reference (breath) signal. While the linear cross- 
correlation with the breath rate explains the cyclic 
structure of the heart rate data, other features, in 
particular the asymmetry under time reversal, remain 
unexplained. Possible explanations include artefacts 
due to the peculiar way of deriving heart rate from 
inter-beat intervals, nonlinear coupling to the breath 
activity, nonlinearity in the cardiac system, and oth- 
ers. 

Financial data — Let us study 1500 daily returns 
(until the end of 1996) of the BUND Future, a de- 
rived financial instrument of the German stock market. 
(Data by courtesy of Thomas Schiirmann, WGZ-Bank 
Diisseldorf . ) The sequence (Fig. 0, upper) is nonsta- 



Figure 2: Nonstationary financial time series (top) 
and a surrogate (bottom) preserving the nonstation- 
ary structure quantified by running window estimates 
of the local mean and variance (middle). 



tionary in the sense that the local mean and variance 
undergo changes on a time scale that is long compared 
to the fiuctuations of the series itself. This property is 
known in the statistical literature as heteroscedastic- 
ity and modeled by the so-called GARCH and related 
models. Here, we want to avoid the construction of 
a parametric model but rather ask the question if the 
data is compatible with the null hypothesis of a cor- 
related linear stochastic process with time dependent 
local mean and variance. We can answer this question 
in a statistical sense by creating surrogate time series 
that show the same linear correlations and the same 
time dependence of the running mean and variance 
as the data and comparing a nonlinear statistic be- 
tween data and surrogates. Accordingly, a cost func- 
tion is set up to match the autocorrelation function 
up to five days and the moving mean and variance in 
sliding windows of 100 days duration. Using a time- 
asymmetry statistic, the null hypothesis could not be 
rejected, suggesting that the above characterisation of 
the data is operationally complete. 

Unevenly sampled data — Let us finally show 
how the new randomisation method can be used to 
test for nonlinearity in time series with time intervals 
of different sizes. Unevenly sampled data are quite 
common, examples include drill core data, astronom- 
ical observations or stock price notations. Most ob- 
servables and algorithms cannot easily be generalised 
to this case which is particularly true for nonlinear 
time series methods. Interpolating the data to equally 
spaced sampling times is not recommendable for a test 
for nonlinearity since one could not a posteriori dis- 
tinguish between genuine structure and nonlinearity 
introduced spuriously by the interpolation process. 

Consider a time series sampled at times {tn} that 
need not be equally spaced. The power spectrum can 
then be estimated by the Lomb periodogram P{uj), 




600000 700000 800000 

time [s] 



Figure 3: The down-sampled data set E with one cor- 
responding surrogate. Gaps of different sizes prevents 
reasonable interpolation. 

as discussed for example in Ref. |lO). For time series 
sampled at constant time intervals, the Lomb peri- 
odogram yields the standard squared Fourier transfor- 
mation. Except for this particular case, it does not 
have any inverse transformation, which makes it im- 
possible to use the standard surrogate data algorithms 
mentioned above. Therefore, we use the Lomb pe- 
riodogram of the data as a constraint for the surro- 
gates. It can be expressed as a cost function for exam- 
ple by: E = J^kii \P'ikuo) - P{kuJo)\. We use P at 
Nf equally spaced frequencies kujo, other choices are 
possible. Consider a series of the time-integrated 
intensity of light observed from a variable star, see 
Fig. ||. It consists of 17 parts with different num- 
bers of points, the time range of which may overlap or 
show gaps of up to 10000 s. Between gaps, the (down- 
sampled) data is evenly sampled with A = 120 s, the 
total number of points is = 2260. The linear null 
hypothesis was not rejected by the time reversibility 
statistic. One surrogate is shown in Fig. ||. 

V. DISCUSSION 

We have set up a statistical hypothesis test of nonlin- 
earity. How interesting its outcome is depends on the 
specific null hypothesis chosen. The most meaningful 
test can be performed if the null hypothesis is plausible 
enough so that we are prepared to believe it in the lack 
of evidence against it. In general, we will specify a set 
of observables we believe to be complete to describe 
the structure found in the data. The surrogates will 
then share these properties with the data and any sig- 
nificant discrepancy between data and surrogates can 
guide to a more complete understanding. 

Recent efforts on the generalisation of randomisa- 
tion schemes try to broaden the repertoire of null hy- 
pothesis we can test against. The hope is that we can 
eventually choose one that is general enough to be ac- 
ceptable if we fail to reject it with the methods we 



have. Still, we cannot prove that there isn't any struc- 
ture in the data beyond what is covered by the null 
hypothesis. From a practical point of view, however, 
there is not much of a difference between structure that 
isn't there and structure that is undetectable with our 
observational means. 
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